Search | Index Medicus Global

A multiscale feature extraction algorithm for dysarthric speech recognition / 生物医学工程学杂志

Jianxing ZHAO; Peiyun XUE; Jing BAI; Chenkang SHI; Bo YUAN; Tongtong SHI.

Journal of Biomedical Engineering ; (6): 44-50, 2023.

Article in Chinese | WPRIM | ID: wpr-970672

ABSTRACT

In this paper, we propose a multi-scale mel-domain feature map extraction algorithm to address the persistently low recognition rate for dysarthric speech. First, we used empirical mode decomposition to decompose the speech signals, then extracted Fbank features and their first-order differences from each of the three effective components to construct a new feature map that captures details in the frequency domain. Second, because single-channel neural networks suffer from loss of effective features and high computational complexity during training, we proposed a speech recognition network model. Finally, training and decoding were performed on the public UA-Speech dataset. The experimental results showed that the speech recognition model built with this method reached an accuracy of 92.77%. The proposed algorithm can therefore effectively improve the recognition rate for dysarthric speech.
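The feature-map construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the empirical mode decomposition step is assumed to have been run already (for example with an EMD library), and three sinusoids stand in for the three effective components. The function names, frame sizes, number of mel filters, and the channel layout of the stacked map are all assumptions made for illustration.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_pts = mel_to_hz(mel_pts)
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def fbank(signal, sr, n_mels=40, frame_len=400, hop=160, n_fft=512):
    """Log mel filterbank energies, shape (n_frames, n_mels)."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    return np.log(power @ mel_filterbank(n_mels, n_fft, sr).T + 1e-10)

def delta(feat):
    """First-order difference along time (central difference, edge-padded)."""
    padded = np.pad(feat, ((1, 1), (0, 0)), mode="edge")
    return (padded[2:] - padded[:-2]) / 2.0

def feature_map(components, sr, **kw):
    """Stack Fbank and its delta per component.

    Returns shape (2 * n_components, n_frames, n_mels); this channel
    layout is an assumption, not taken from the paper.
    """
    channels = []
    for c in components:
        f = fbank(np.asarray(c, dtype=float), sr, **kw)
        channels.extend([f, delta(f)])
    return np.stack(channels)

# Placeholder input: three sinusoids standing in for the three
# effective components that an EMD step would produce.
sr = 16000
t = np.arange(sr) / sr
components = [np.sin(2 * np.pi * f0 * t) for f0 in (200.0, 800.0, 3000.0)]
fmap = feature_map(components, sr)
print(fmap.shape)  # (6, 98, 40)
```

Stacking the per-component features as channels gives a multi-channel map that a convolutional front end could consume; whether the paper arranges the map this way is not stated in the abstract.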

Subjects

Humans, Dysarthria/diagnosis, Speech, Speech Perception, Algorithms, Neural Networks, Computer

An acoustic-articulatory study of the nasal finals in students with and without hearing loss / 生物医学工程学杂志

Qing WANG; Jing BAI; Peiyun XUE; Xueying ZHANG; Pei FENG.

Journal of Biomedical Engineering ; (6): 198-205, 2018.

Article in Chinese | WPRIM | ID: wpr-687645

ABSTRACT

The central aim of this experiment was to compare the articulatory and acoustic characteristics of students with normal hearing (NH) and school-aged children with hearing loss (HL), and to explore the articulatory-acoustic relations during the production of nasal finals. Fourteen students with HL and 10 NH controls were enrolled in this study; the data of 4 HL students were excluded because of their high pronunciation error rates. Data were collected using an electromagnetic articulograph. The acoustic and kinematic data of the nasal finals were extracted with phonetics and data processing software, and all data were analyzed by t-test and correlation analysis. The results showed statistically significant differences (P < 0.05 or P < 0.01) between the HL and NH groups in the first two formant frequencies (F1, F2), tongue position, and articulatory-acoustic relations across different vowels. In /en/ and /eng/, the relation between vertical movement and F1 in the HL group was the same as in the NH group. These findings on participants with HL can support speech rehabilitation training aimed at improving their pronunciation accuracy.
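The correlation analysis mentioned in the abstract relates a kinematic measure to an acoustic one. A minimal sketch of a Pearson correlation between vertical tongue position and F1 is shown here; the function name and the numbers are made-up placeholders for illustration only and are not data from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical placeholder values (NOT from the study):
# vertical tongue position in mm and the matching F1 in Hz.
tongue_height_mm = [2.0, 4.0, 6.0, 8.0, 10.0]
f1_hz = [700.0, 640.0, 560.0, 500.0, 420.0]

r = pearson_r(tongue_height_mm, f1_hz)
print(f"r = {r:.3f}")  # negative here: higher tongue position, lower F1
```

Computing this coefficient separately for each group and each nasal final is one way to compare articulatory-acoustic relations between groups; the abstract does not say which software or exact procedure the authors used.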
